Search results for "Temporal difference learning"

Showing 4 of 4 documents

Least-squares temporal difference learning based on an extreme learning machine

2014

Abstract: Reinforcement learning (RL) is a general class of algorithms for solving decision-making problems, which are usually modeled using the Markov decision process (MDP) framework. RL can find exact solutions only when the MDP state space is discrete and small enough. Because many real-world problems are described by continuous variables, approximation is essential in practical applications of RL. This paper focuses on learning the value function of a fixed policy in continuous MDPs, an important subproblem of several RL algorithms. We propose a least-squares temporal difference (LSTD) algorithm based on the extreme learning machine. LSTD is typically combined wi…
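To make the combination concrete, here is a minimal Python sketch of LSTD on top of an ELM-style random feature map (fixed random hidden weights with a tanh activation). The function names, the activation choice, and all parameter values are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_elm_features(state_dim, n_hidden):
    # Extreme-learning-machine style feature map: hidden weights are drawn
    # once at random and never trained.
    W = rng.normal(size=(n_hidden, state_dim))
    b = rng.normal(size=n_hidden)
    return lambda s: np.tanh(W @ s + b)

def lstd(transitions, phi, gamma=0.99, reg=1e-3):
    # Least-squares temporal difference learning for a fixed policy:
    # solve A theta = b with A = sum phi(s)(phi(s) - gamma phi(s'))^T, b = sum phi(s) r.
    n = phi(transitions[0][0]).size
    A = reg * np.eye(n)        # small ridge term keeps A invertible
    b = np.zeros(n)
    for s, r, s_next in transitions:
        f, f_next = phi(s), phi(s_next)
        A += np.outer(f, f - gamma * f_next)
        b += r * f
    return np.linalg.solve(A, b)   # weights of the linear value estimate V(s) ~ theta . phi(s)

# Usage on synthetic transitions (s, r, s') from a 2-D continuous state space
phi = make_elm_features(state_dim=2, n_hidden=50)
transitions = [(rng.normal(size=2), rng.normal(), rng.normal(size=2)) for _ in range(500)]
theta = lstd(transitions, phi)
print(theta.shape)  # (50,)
```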

Mathematical optimization; Artificial neural network; Artificial Intelligence; Cognitive Neuroscience; Bellman equation; Reinforcement learning; State space; Markov decision process; Temporal difference learning; Computer Science Applications; Mathematics; Extreme learning machine; Curse of dimensionality; Neurocomputing

Prediction error signal correlates with fluid intelligence and dopamine synthesis across the lifespan

2011

Introduction: Fluid intelligence expresses the capacity for interpretation of novel stimuli and flexible behavioral adaptation to such cues. Phasic dopamine firing closely matches a temporal difference prediction error (PE) signal important for learning and rapid behavioral adaptation. Both fluid intelligence and dopaminergic neurotransmission decline with age. So far, no study has investigated the relationship between fluid IQ, the PE signal, and direct measures of dopaminergic neurotransmission. Here we used a multimodal imaging approach that combines positron emission tomography and functional magnetic resonance imaging. Methods: A group of healthy controls was investigated with both 6-[18F]FluoroDOPA…
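For context, the temporal difference prediction error (PE) referred to in this abstract is the standard reward prediction error of TD learning, usually written as below; the symbols (reward r, discount γ, value estimate V, learning rate α) are the conventional ones, not quantities defined by this study.

```latex
% Standard TD prediction error and the value update it drives
\delta_t = r_{t+1} + \gamma V(s_{t+1}) - V(s_t),
\qquad V(s_t) \leftarrow V(s_t) + \alpha \, \delta_t .
```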

medicine.diagnostic_test; Putamen; Ventral striatum; Psychiatry and Mental health; medicine.anatomical_structure; Dopamine; Basal ganglia; medicine; Fluorodopa; Temporal difference learning; Psychology; Functional magnetic resonance imaging; Prefrontal cortex; Neuroscience; medicine.drug; European Psychiatry

Temporal difference method for processing dynamic speckle patterns

2010

A temporal difference method for processing dynamic speckle images is proposed. In the method, two speckle images of an object, separated by a time interval, are subtracted from each other to detect whether the speckle structure has changed. The rationale of the method is discussed. A variant of the method that allows measuring the area of an activity zone surrounded by a static region is tested in digital simulations. As a demonstrative experiment, that variant is employed to characterize the drying of a damp patch in filter paper.
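As a rough illustration of the subtraction step (not the authors' implementation), the following sketch differences two frames, thresholds the result, and counts the changed pixels as the activity area; the threshold value and frame sizes are arbitrary assumptions.

```python
import numpy as np

def activity_map(frame_t0, frame_t1, threshold=10):
    # Temporal difference of two speckle frames taken a time interval apart:
    # pixels whose speckle structure changed show a large absolute difference.
    diff = np.abs(frame_t1.astype(float) - frame_t0.astype(float))
    active = diff > threshold          # boolean mask of the activity zone
    return active, int(active.sum())   # mask and its area in pixels

# Usage with synthetic frames: the left half is kept static, the right half is
# replaced by a new speckle realisation (i.e. it is "active").
rng = np.random.default_rng(1)
f0 = rng.integers(0, 256, size=(128, 128))
f1 = f0.copy()
f1[:, 64:] = rng.integers(0, 256, size=(128, 64))
mask, area = activity_map(f0, f1)
print(area)  # roughly 128 * 64, minus pixels that changed by 10 grey levels or less
```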

Dynamic speckle; Computer science; business.industry; Speckle noise; Image processing; Interval (mathematics); Atomic and Molecular Physics and Optics; Electronic, Optical and Magnetic Materials; Speckle pattern; Optics; Electronic speckle pattern interferometry; Electrical and Electronic Engineering; Physical and Theoretical Chemistry; Temporal difference learning; business; Optics Communications

Using Inverse Reinforcement Learning with Real Trajectories to Get More Trustworthy Pedestrian Simulations

2020

Reinforcement learning is one of the most promising machine learning techniques to get intelligent behaviors for embodied agents in simulations. The output of the classic Temporal Difference family of Reinforcement Learning algorithms adopts the form of a value function expressed as a numeric table or a function approximator. The learned behavior is then derived using a greedy policy with respect to this value function. Nevertheless, sometimes the learned policy does not meet expectations, and the task of authoring is difficult and unsafe because the modification of one value or parameter in the learned value function has unpredictable consequences in the space of the policies it represents…
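The point about the value table and the greedy policy can be seen in a few lines. This is a generic tabular TD example (Q-learning on a toy 1-D grid as one classic member of the family), not the pedestrian model from the paper, and every parameter value is an illustrative assumption.

```python
import numpy as np

# Generic illustration: a classic TD method fills a numeric table, and the
# learned behaviour is simply the greedy policy with respect to that table.
n_states, n_actions, gamma, alpha = 5, 2, 0.9, 0.1   # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(2)

def step(s, a):
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s_next, (1.0 if s_next == n_states - 1 else 0.0)  # reward at the right end

for _ in range(5000):                                 # TD updates on random transitions
    s, a = rng.integers(n_states), rng.integers(n_actions)
    s_next, r = step(s, a)
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])   # TD error update

greedy_policy = Q.argmax(axis=1)   # behaviour derived from the value table
print(greedy_policy)               # expected: [1 1 1 1 1], i.e. always move right
```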

0209 industrial biotechnology; reinforcement learning; Computer science; General Mathematics; 02 engineering and technology; pedestrian simulation; Task (project management); learning by demonstration; 020901 industrial engineering & automation; Learning; Informatics; Bellman equation; 0202 electrical engineering, electronic engineering, information engineering; Computer Science (miscellaneous); Reinforcement learning; Engineering (miscellaneous); business.industry; causal entropy; lcsh:Mathematics; Process (computing); 020206 networking & telecommunications; Function (mathematics); inverse reinforcement learning; lcsh:QA1-939; Problem domain; Table (database); Artificial intelligence; Temporal difference learning; business; optimization; Mathematics